2024 Journal article Open Access
Deep learning and structural health monitoring: a TFT-based approach for anomaly detection in masonry towers
Falchi F., Girardi M., Gurioli G., Messina N., Padovani C., Pellegrini D.
Detecting anomalies in the vibrational features of age-old buildings is crucial within the Structural Health Monitoring (SHM) framework. SHM techniques can leverage information from onsite measurements and environmental sources to identify the dynamic properties (such as the frequencies) of the monitored structure, searching for possible deviations or unusual behavior over time. In this paper, the Temporal Fusion Transformer (TFT) network, a deep learning algorithm initially designed for multi-horizon time series forecasting and tested on electricity, traffic, retail, and volatility problems, is applied to SHM. The TFT approach is adopted to investigate the behavior of the Guinigi Tower located in Lucca (Italy) and subjected to a long-term dynamic monitoring campaign. The TFT network is trained on the tower's experimental frequencies enriched with other environmental parameters. The transformer is then employed to predict the vibrational features (natural frequencies, root mean square values of the velocity time series) and detect possible anomalies or unexpected events by inspecting how much the actual frequencies deviate from the predicted ones. The TFT technique is used to detect the effects of the Viareggio earthquake that occurred on 6 February 2022, and the structural damage induced by three simulated damage scenarios.
Source: Social Science Research Network (2024)
DOI: 10.2139/ssrn.4679906
See at: ISTI Repository Open Access | papers.ssrn.com Open Access | CNR ExploRA
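
The anomaly criterion described in this abstract (flagging time steps where the observed natural frequencies deviate too much from the TFT forecasts) can be illustrated in a few lines. A minimal sketch, assuming hypothetical arrays of predicted and observed frequencies and a simple k-sigma rule on the residuals; the paper's actual model, features, and calibration may differ:

```python
# Hypothetical sketch: flag time steps whose prediction error is unusually
# large compared with the residuals observed during a healthy reference period.
import numpy as np

def detect_anomalies(predicted, observed, reference_residuals, k=3.0):
    """Return a boolean mask marking steps with unusually large residuals."""
    residuals = np.abs(observed - predicted)
    return residuals > reference_residuals.mean() + k * reference_residuals.std()

# Toy usage with synthetic data (natural frequencies in Hz).
rng = np.random.default_rng(0)
pred = 2.0 + 0.01 * rng.standard_normal(100)    # stand-in for TFT forecasts
obs = pred + 0.005 * rng.standard_normal(100)   # healthy behaviour
obs[60:] -= 0.05                                # simulated frequency drop (damage)
ref = np.abs(0.005 * rng.standard_normal(500))  # residuals from the training period
print(np.where(detect_anomalies(pred, obs, ref))[0])  # flags steps 60..99
```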


2024 Journal article Open Access
Cascaded transformer-based networks for Wikipedia large-scale image-caption matching
Messina N., Coccomini D. A., Esuli A., Falchi F.
With the increasing importance of multimedia and multilingual data in online encyclopedias, novel methods are needed to fill domain gaps and automatically connect different modalities for increased accessibility. For example, Wikipedia is composed of millions of pages written in multiple languages. Images, when present, often lack textual context, thus remaining conceptually floating and harder to find and manage. In this work, we tackle the novel task of associating images from Wikipedia pages with the correct caption among a large pool of available ones written in multiple languages, as required by the image-caption matching Kaggle challenge organized by the Wikimedia Foundation. A system able to perform this task would improve the accessibility and completeness of the underlying multi-modal knowledge graph in online encyclopedias. We propose a cascade of two models powered by recent Transformer networks that efficiently and effectively infer a relevance score between the query image data and the captions. We verify through extensive experiments that the proposed cascaded approach effectively handles a large pool of images and captions while keeping the overall computational complexity at inference time bounded. With respect to other approaches in the challenge leaderboard, we achieve remarkable improvements over the previous proposals (+8% in nDCG@5 with respect to the sixth position) with constrained resources. The code is publicly available at https://tinyurl.com/wiki-imcap.
Source: Multimedia Tools and Applications (2024)
DOI: 10.1007/s11042-023-17977-0
Project(s): AI4Media via OpenAIRE
See at: link.springer.com Open Access | ISTI Repository Open Access | CNR ExploRA
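
The cascade described in this abstract pairs a cheap first stage, which scores every caption, with an expensive second stage applied only to the survivors. The sketch below shows the general pattern; the dot-product scoring over precomputed embeddings and the slow_score callable are hypothetical stand-ins for the paper's two Transformer models:

```python
# Illustrative two-stage cascade: a cheap dot product prunes the caption pool,
# then an expensive scorer re-ranks only the top-k survivors.
import numpy as np

def cascade_rank(image_emb, caption_embs, slow_score, k=100):
    coarse = caption_embs @ image_emb              # stage 1: O(N) cheap scores
    top_k = np.argsort(-coarse)[:k]                # keep only the k best candidates
    fine = np.array([slow_score(image_emb, caption_embs[i]) for i in top_k])
    return top_k[np.argsort(-fine)]                # stage 2: final ordering

# Toy usage: random embeddings stand in for the Transformer outputs.
rng = np.random.default_rng(0)
img = rng.standard_normal(256)
caps = rng.standard_normal((10_000, 256))
ranking = cascade_rank(img, caps, slow_score=lambda i, c: float(i @ c))
print(ranking[:5])                                 # indices of the 5 best captions
```

The point of the cascade is that the expensive model runs k times instead of N times, which keeps inference complexity bounded as the caption pool grows.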


2023 Conference article Open Access
CrowdSim2: an open synthetic benchmark for object detectors
Foszner P., Szczesna A., Ciampi L., Messina N., Cygan A., Bizon B., Cogiel M., Golba D., Macioszek E., Staniszewski M.
Data scarcity has become one of the main obstacles to developing supervised models based on Artificial Intelligence in Computer Vision. Indeed, Deep Learning-based models systematically struggle when applied in new scenarios never seen during training and may not be adequately tested in non-ordinary yet crucial real-world situations. This paper presents and publicly releases CrowdSim2, a new synthetic collection of images suitable for people and vehicle detection, gathered from a simulator based on the Unity graphical engine. It consists of thousands of images gathered from various synthetic scenarios resembling the real world, where we varied some factors of interest, such as the weather conditions and the number of objects in the scenes. The labels are automatically collected and consist of bounding boxes that precisely localize objects belonging to the two object classes, leaving humans out of the annotation pipeline. We exploited this new benchmark as a testing ground for some state-of-the-art detectors, showing that our simulated scenarios can be a valuable tool for measuring their performance in a controlled environment.
Source: VISIGRAPP 2023 - 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, pp. 676–683, Lisbon, Portugal, 19-21/02/2023
DOI: 10.5220/0011692500003417
Project(s): AI4Media via OpenAIRE
See at: ISTI Repository Open Access | www.scitepress.org Restricted | CNR ExploRA


2023 Conference article Open Access
Development of a realistic crowd simulation environment for fine-grained validation of people tracking methods
Foszner P., Szczesna A., Ciampi L., Messina N., Cygan A., Bizon B., Cogiel M., Golba D., Macioszek E., Staniszewski M.
Generally, crowd datasets can be collected or generated from real or synthetic sources. Real data are collected using infrastructure-based sensors (such as static cameras or other sensors). The use of simulation tools can significantly reduce the time required to generate scenario-specific crowd datasets, facilitate data-driven research, and support the building of functional machine learning models. The main goal of this work was to develop an extension of crowd simulation (named CrowdSim2) and prove its usability for the application of people-tracking algorithms. The simulator is developed using the popular Unity 3D engine, with particular emphasis on realism in the environment, weather conditions, traffic, and the movement and models of individual agents. Finally, three tracking methods were used to validate the generated dataset: IOU-Tracker, Deep-Sort, and Deep-TAMA.
Source: VISIGRAPP 2023 - 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, pp. 222–229, Lisbon, Portugal, 19-21/02/2023
DOI: 10.5220/0011691500003417
Project(s): AI4Media via OpenAIRE
See at: ISTI Repository Open Access | www.scitepress.org Restricted | CNR ExploRA
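
Of the three trackers used for validation, IOU-Tracker is the simplest: detections in consecutive frames are linked purely by bounding-box overlap. A minimal sketch of that association logic, assuming axis-aligned boxes in (x1, y1, x2, y2) format and a greedy matching policy:

```python
# Minimal IoU-based association in the spirit of IOU-Tracker: each track is
# greedily extended with the detection that best overlaps its last box.
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def link_frame(tracks, detections, thresh=0.5):
    """Extend tracks with this frame's detections; start new tracks otherwise."""
    unmatched = list(detections)
    for track in tracks:
        if not unmatched:
            break
        best = max(unmatched, key=lambda d: iou(track[-1], d))
        if iou(track[-1], best) >= thresh:
            track.append(best)
            unmatched.remove(best)
    tracks.extend([d] for d in unmatched)
    return tracks

tracks = link_frame([], [(0, 0, 10, 10)])       # frame 1: one new track
tracks = link_frame(tracks, [(1, 1, 11, 11)])   # frame 2: linked (IoU ~ 0.68)
print(len(tracks), len(tracks[0]))              # -> 1 2
```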


2023 Conference article Restricted
Improving query and assessment quality in text-based interactive video retrieval evaluation
Bailer W., Arnold R., Benz V., Coccomini D., Gkagkas A., Þór Guðmundsson G., Heller S., Þór Jónsson B., Lokoc J., Messina N., Pantelidis N., Wu J.
Different task interpretations are a highly undesired element in interactive video retrieval evaluations. When a participating team focuses partially on a wrong goal, the evaluation results might become partially misleading. In this paper, we propose a process for refining known-item and open-set type queries, and for preparing the assessors who judge the correctness of submissions to open-set queries. Our findings from recent years reveal that a proper methodology can lead to objective query quality improvements and subjective participant satisfaction with query clarity.
Source: ICMR '23: International Conference on Multimedia Retrieval, pp. 597–601, Thessaloniki, Greece, 12-15/06/2023
DOI: 10.1145/3591106.3592281
Project(s): AI4Media via OpenAIRE
See at: dl.acm.org Restricted | CNR ExploRA


2023 Conference article Open Access
Text-to-motion retrieval: towards joint understanding of human motion data and natural language
Messina N., Sedmidubský J., Falchi F., Rebok T.
Due to recent advances in pose-estimation methods, human motion can be extracted from a common video in the form of 3D skeleton sequences. Despite promising application opportunities, effective and efficient content-based access to large volumes of such spatio-temporal skeleton data remains a challenging problem. In this paper, we propose a novel content-based text-to-motion retrieval task, which aims at retrieving relevant motions based on a specified natural-language textual description. To define baselines for this uncharted task, we employ the BERT and CLIP language representations to encode the text modality and successful spatio-temporal models to encode the motion modality. We additionally introduce our transformer-based approach, called Motion Transformer (MoT), which employs divided space-time attention to effectively aggregate the different skeleton joints in space and time. Inspired by the recent progress in text-to-image/video matching, we experiment with two widely-adopted metric-learning loss functions. Finally, we set up a common evaluation protocol by defining qualitative metrics for assessing the quality of the retrieved motions, targeting the two recently-introduced KIT Motion-Language and HumanML3D datasets. The code for reproducing our results is available at https://github.com/mesnico/text-to-motion-retrieval.
Source: SIGIR '23: The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2420–2425, Taipei, Taiwan, 23-27/07/2023
DOI: 10.1145/3539618.3592069
Project(s): AI4Media via OpenAIRE
See at: ISTI Repository Open Access | CNR ExploRA
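
The retrieval step shared by the baselines and MoT embeds both modalities in a common space and ranks motions by similarity to the text query. A minimal sketch, assuming hypothetical precomputed embeddings in place of the BERT/CLIP text encoders and the motion encoder:

```python
# Rank motion sequences by cosine similarity to an encoded text query.
import numpy as np

def rank_motions(text_emb, motion_embs):
    t = text_emb / np.linalg.norm(text_emb)
    m = motion_embs / np.linalg.norm(motion_embs, axis=1, keepdims=True)
    return np.argsort(-(m @ t))                  # best-matching motions first

rng = np.random.default_rng(0)
query = rng.standard_normal(512)                 # stand-in for an encoded description
gallery = rng.standard_normal((1_000, 512))      # stand-in for encoded skeleton clips
print(rank_motions(query, gallery)[:5])          # top-5 retrieved motion indices
```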


2023 Journal article Open Access
Interactive video retrieval in the age of effective joint embedding deep models: lessons from the 11th VBS
Lokoc J., Andreadis S., Bailer W., Duane A., Gurrin C., Ma Z., Messina N., Nguyen T. N., Peska L., Rossetto L., Sauter L., Schall K., Schoeffmann K., Khan O. S., Spiess F., Vadicamo L., Vrochidis S.
This paper presents findings of the eleventh Video Browser Showdown competition, where sixteen teams competed in known-item and ad-hoc search tasks. Many of the teams utilized state-of-the-art video retrieval approaches that demonstrated high effectiveness in challenging search scenarios. In this paper, a broad survey of all utilized approaches is presented in connection with an analysis of the performance of participating teams. Specifically, high-level performance indicators with overall statistics are presented, together with an in-depth analysis of the performance of selected tools implementing result-set logging. The analysis reveals evidence that the CLIP model represents a versatile tool for cross-modal video retrieval when combined with interactive search capabilities. Furthermore, the analysis investigates the effect of different users and text query properties on the performance in search tasks. Last but not least, lessons learned from search task preparation are presented, and a new direction for ad-hoc search-based tasks at the Video Browser Showdown is introduced.
Source: Multimedia Systems (2023)
DOI: 10.1007/s00530-023-01143-5
Project(s): AI4Media via OpenAIRE, XRECO via OpenAIRE
See at: ISTI Repository Open Access | ZENODO Open Access | link.springer.com Restricted | CNR ExploRA


2023 Conference article Open Access
VISIONE: a large-scale video retrieval system with advanced search functionalities
Amato G., Bolettieri P., Carrara F., Falchi F., Gennaro C., Messina N., Vadicamo L., Vairo C.
VISIONE is a large-scale video retrieval system that integrates multiple search functionalities, including free text search, spatial color and object search, visual and semantic similarity search, and temporal search. The system leverages cutting-edge AI technology for visual analysis and advanced indexing techniques to ensure scalability. As demonstrated by its runner-up position in the 2023 Video Browser Showdown competition, VISIONE effectively integrates these capabilities to provide a comprehensive video retrieval solution. A system demo is available online, showcasing its capabilities on over 2,300 hours of diverse video content (V3C1+V3C2 dataset) and 12 hours of highly redundant content (Marine dataset). The demo can be accessed at https://visione.isti.cnr.it
Source: ICMR '23: International Conference on Multimedia Retrieval, pp. 649–653, Thessaloniki, Greece, 12-15/06/2023
DOI: 10.1145/3591106.3592226
Project(s): AI4Media via OpenAIRE
See at: ISTI Repository Open Access | CNR ExploRA


2023 Conference article Open Access
VISIONE at Video Browser Showdown 2023
Amato G., Bolettieri P., Carrara F., Falchi F., Gennaro C., Messina N., Vadicamo L., Vairo C.
In this paper, we present the fourth release of VISIONE, a tool for fast and effective video search on a large-scale dataset. It includes several search functionalities like text search, object and color-based search, semantic and visual similarity search, and temporal search. VISIONE uses ad-hoc textual encoding for indexing and searching video content, and it exploits a full-text search engine as its search backend. In this new version of the system, we introduced some changes both to the current search techniques and to the user interface.
Source: MMM 2023 - 29th International Conference on Multi Media Modeling, pp. 615–621, Bergen, Norway, 9-12/01/2023
DOI: 10.1007/978-3-031-27077-2_48
Project(s): AI4Media via OpenAIRE
See at: ISTI Repository Open Access | ZENODO Open Access | CNR ExploRA
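
One way a full-text search engine can index dense visual features, in the spirit of the ad-hoc textual encoding mentioned above, is a surrogate-text representation: each feature vector is rewritten as a string of codeword tokens whose term frequencies mirror the vector's component magnitudes, so the engine's term-frequency scoring approximates a dot product. The sketch below is illustrative only and is not claimed to be VISIONE's exact encoding:

```python
# Illustrative surrogate-text encoding: repeat a per-dimension codeword
# proportionally to the quantized (non-negative) component magnitude.
import numpy as np

def surrogate_text(vector, levels=10):
    v = np.clip(vector, 0, None)                           # keep positive part only
    q = np.round(levels * v / (v.max() + 1e-9)).astype(int)
    return " ".join(f"f{i}" for i, n in enumerate(q) for _ in range(n))

print(surrogate_text(np.array([0.9, 0.0, 0.3, 0.6])))
# -> "f0 f0 f0 f0 f0 f0 f0 f0 f0 f0 f2 f2 f2 f3 f3 f3 f3 f3 f3 f3"
```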


2023 Conference article Open Access
MC-GTA: a synthetic benchmark for multi-camera vehicle tracking
Ciampi L., Messina N., Valenti G. E., Amato G., Falchi F., Gennaro C.
Multi-camera vehicle tracking (MCVT) aims to trace multiple vehicles among videos gathered from overlapping and non-overlapping city cameras. It is beneficial for city-scale traffic analysis and management as well as for security. However, developing MCVT systems is tricky, and their real-world applicability is hampered by the lack of data for training and testing computer vision deep learning-based solutions. Indeed, creating new annotated datasets is cumbersome, as it requires great human effort and often raises privacy concerns. To alleviate this problem, we introduce MC-GTA - Multi Camera Grand Tracking Auto, a synthetic collection of images gathered from the virtual world provided by the highly-realistic Grand Theft Auto 5 (GTA) video game. Our dataset was recorded by several cameras capturing urban scenes at various crossroads. The annotations, consisting of bounding boxes localizing the vehicles with associated unique IDs consistent across the video sources, have been automatically generated by interacting with the game engine. To assess this simulated scenario, we conduct a performance evaluation using a state-of-the-art MCVT approach, showing that it can be a valuable benchmark that mitigates the need for real-world data. The MC-GTA dataset and the code for creating new ad-hoc custom scenarios are available at https://github.com/GaetanoV10/GT5-Vehicle-BB.
Source: ICIAP 2023 - 22nd International Conference on Image Analysis and Processing, pp. 316–327, Udine, Italy, 11-15/09/2023
DOI: 10.1007/978-3-031-43148-7_27
Project(s): AI4Media via OpenAIRE
See at: ISTI Repository Open Access | link.springer.com Restricted | CNR ExploRA


2023 Conference article Open Access
AIMH Lab 2022 activities for Vision
Ciampi L., Amato G., Bolettieri P., Carrara F., Di Benedetto M., Falchi F., Gennaro C., Messina N., Vadicamo L., Vairo C.
The explosion of smartphones and cameras has led to a vast production of multimedia data. Consequently, Artificial Intelligence-based tools for automatically understanding and exploring these data have recently gained much attention. In this short paper, we report some activities of the Artificial Intelligence for Media and Humanities (AIMH) laboratory of the ISTI-CNR, tackling some challenges in the field of Computer Vision for the automatic understanding of visual data and for novel interactive tools aimed at multimedia data exploration. Specifically, we provide innovative solutions based on Deep Learning techniques carrying out typical vision tasks such as object detection and visual counting, with particular emphasis on scenarios characterized by scarcity of the labeled data needed for supervised training and on environments with limited power resources that impose model miniaturization. Furthermore, we describe VISIONE, our large-scale video search system designed to search extensive multimedia databases in an interactive and user-friendly manner.
Source: Ital-IA 2023, pp. 538–543, Pisa, Italy, 29-31/05/2023
Project(s): AI4Media via OpenAIRE

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA


2023 Conference article Open Access
AIMH Lab approaches for deepfake detection
Coccomini D. A., Caldelli R., Esuli A., Falchi F., Gennaro C., Messina N., Amato G.
The creation of highly realistic media known as deepfakes has been facilitated by the rapid development of artificial intelligence technologies, including deep learning algorithms, in recent years. Concerns about the increasing ease of creating credible deepfakes have been growing, prompting researchers around the world to concentrate their efforts on the field of deepfake detection. In this context, researchers at ISTI-CNR's AIMH Lab have carried out numerous studies, investigations, and proposals to contribute to combating this worrying phenomenon. In this paper, we present the main work carried out in the field of deepfake detection and synthetic content detection, conducted by our researchers and in collaboration with external organizations.
Source: Ital-IA 2023, pp. 432–436, Pisa, Italy, 29-31/05/2023
Project(s): AI4Media via OpenAIRE

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA


2023 Conference article Open Access
An optimized pipeline for image-based localization in museums from egocentric images
Messina N., Falchi F., Furnari A., Gennaro C., Farinella G. M.
With the increasing interest in augmented and virtual reality, visual localization is acquiring a key role in many downstream applications requiring a real-time estimate of the user location from visual streams alone. In this paper, we propose an optimized hierarchical localization pipeline specifically targeting cultural heritage sites, with applications in museums. Specifically, we propose to enhance the Structure from Motion (SfM) pipeline for constructing the sparse 3D point cloud by filtering out blurred and near-duplicate images beforehand. We also study an improved inference pipeline that merges similarity-based localization with geometric pose estimation to effectively mitigate the effect of strong outliers. We show that the proposed optimized pipeline obtains the lowest localization error on the challenging Bellomo dataset. Our proposed approach keeps both build and inference times bounded, in turn enabling the deployment of this pipeline in real-world scenarios.
Source: ICIAP 2023 - 22nd International Conference on Image Analysis and Processing, pp. 512–524, Udine, Italy, 11-15/09/2023
DOI: 10.1007/978-3-031-43148-7_43
Project(s): AI4Media via OpenAIRE
See at: IRIS - Università degli Studi di Catania Open Access | ISTI Repository Open Access | doi.org Restricted | CNR ExploRA
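
A common way to implement the blur pre-filtering mentioned in the abstract is the variance-of-the-Laplacian test: sharp images have strong edges, so a low Laplacian variance suggests blur. A sketch under that assumption (the paper's exact criterion and threshold may differ), with hypothetical file names:

```python
# Drop blurred frames before Structure from Motion using the variance of the
# Laplacian; low variance means few sharp edges, i.e. a likely blurred image.
import cv2

def is_blurred(image_path, threshold=100.0):
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(image_path)
    return cv2.Laplacian(gray, cv2.CV_64F).var() < threshold

# Hypothetical usage: keep only sharp images for the SfM reconstruction.
keep = [p for p in ["img_001.jpg", "img_002.jpg"] if not is_blurred(p)]
```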


2023 Report Open Access
AIMH Research Activities 2023
Aloia N., Amato G., Bartalesi V., Bianchi L., Bolettieri P., Bosio C., Carraglia M., Carrara F., Casarosa V., Ciampi L., Coccomini D. A., Concordia C., Corbara S., De Martino C., Di Benedetto M., Esuli A., Falchi F., Fazzari E., Gennaro C., Lagani G., Lenzi E., Meghini C., Messina N., Molinari A., Moreo A., Nardi A., Pedrotti A., Pratelli N., Puccetti G., Rabitti F., Savino P., Sebastiani F., Sperduti G., Thanos C., Trupiano L., Vadicamo L., Vairo C., Versienti L.
The AIMH (Artificial Intelligence for Media and Humanities) laboratory is dedicated to exploring and pushing the boundaries of Artificial Intelligence, with a particular focus on its application in digital media and the humanities. The lab's objective is to advance the state of AI technology, particularly in deep learning, text analysis, computer vision, multimedia information retrieval, multimedia content analysis, recognition, and retrieval. This report encapsulates the laboratory's progress and activities throughout the year 2023.
Source: ISTI Annual Reports, 2023
DOI: 10.32079/isti-ar-2023/001
See at: ISTI Repository Open Access | CNR ExploRA


2023 Conference article Open Access
VISIONE for newbies: an easier-to-use video retrieval system
Amato G., Bolettieri P., Carrara F., Falchi F., Gennaro C., Messina N., Vadicamo L., Vairo C.
This paper presents a revised version of the VISIONE video retrieval system, which offers a wide range of search functionalities, including free text search, spatial color and object search, visual and semantic similarity search, and temporal search. The system is designed to ensure scalability through advanced indexing techniques and effectiveness through cutting-edge Artificial Intelligence technology for visual content analysis. VISIONE was the runner-up in the 2023 Video Browser Showdown competition, demonstrating its comprehensive video retrieval capabilities. In this paper, we detail the improvements made to the search and browsing interface to enhance its usability for non-expert users. A demonstration video of our system with the restyled interface, showcasing its capabilities on over 2,300 hours of diverse video content, is available online at https://youtu.be/srD3TCUkMSg
Source: CBMI 2023 - 20th International Conference on Content-based Multimedia Indexing, pp. 158–162, Orleans, France, 20-22/09/2023
DOI: 10.1145/3617233.3617261
Project(s): AI4Media via OpenAIRE
See at: ISTI Repository Open Access | CNR ExploRA


2022 Conference article Open Access
AIMH Lab for Trustworthy AI
Messina N., Carrara F., Coccomini D., Falchi F., Gennaro C., Amato G.
In this short paper, we report the activities of the Artificial Intelligence for Media and Humanities (AIMH) laboratory of the ISTI-CNR related to Trustworthy AI. Artificial Intelligence is becoming more and more pervasive in our society, controlling recommendation systems in social platforms as well as safety-critical systems like autonomous vehicles. In order to be safe and trustworthy, these systems need to be easily interpretable and transparent. On the other hand, it is important to spot fake examples forged by malicious AI generative models to fool humans (through fake news or deepfakes) or other AI systems (through adversarial examples). This is required to enforce an ethical use of these powerful new technologies. Driven by these concerns, this paper presents three crucial research directions contributing to the study and development of reliable, resilient, and explainable deep learning methods. Namely, we report the laboratory activities on the detection of adversarial examples, the use of attentive models as a way towards explainable deep learning, and the detection of deepfakes in social platforms.
Source: Ital-IA 2022 - Workshop su AI Responsabile ed Affidabile, Online conference, 10/02/2022

See at: ISTI Repository Open Access | www.ital-ia2022.it Open Access | CNR ExploRA


2022 Conference article Open Access
AIMH Lab for Cybersecurity
Vairo C., Coccomini D. A., Falchi F., Gennaro C., Massoli F. V., Messina N., Amato G.
In this short paper, we report the activities of the Artificial Intelligence for Media and Humanities (AIMH) laboratory of the ISTI-CNR related to Cybersecurity. We discuss our active research fields, their applications, and their challenges. We focus on face recognition and the detection of adversarial examples and deepfakes. We also present our activities on the detection of persuasion techniques combining image and text analysis.
Source: Ital-IA 2022 - Workshop su AI per Cybersecurity, 10/02/2022

See at: ISTI Repository Open Access | www.ital-ia2022.it Open Access | CNR ExploRA


2022 Conference article Open Access
AIMH Lab: Smart Cameras for Public Administration
Ciampi L., Cafarelli D., Carrara F., Di Benedetto M., Falchi F., Gennaro C., Massoli F. V., Messina N., Amato G.
In this short paper, we report the activities of the Artificial Intelligence for Media and Humanities (AIMH) laboratory of the ISTI-CNR related to Public Administration. In particular, we present some AI-based public services for citizens that help achieve common goals beneficial to society, putting humans at the center. Through the automatic analysis of images gathered from city cameras, we provide AI applications ranging from smart parking and smart mobility to human activity monitoring.
Source: Ital-IA 2022 - Workshop su AI per la Pubblica Amministrazione, Online conference, 10/02/2022

See at: ISTI Repository Open Access | www.ital-ia2022.it Open Access | CNR ExploRA


2022 Doctoral thesis Open Access
Relational Learning in computer vision
Messina N.
The increasing interest in social networks, smart cities, and Industry 4.0 is encouraging the development of techniques for processing, understanding, and organizing vast amounts of data. Recent important advances in Artificial Intelligence brought to life a subfield of Machine Learning called Deep Learning, which can automatically learn common patterns from raw data directly, without relying on manual feature selection. This framework has revolutionized many computer science fields, such as Computer Vision and Natural Language Processing, obtaining astonishing results. Nevertheless, many challenges are still open. Although deep neural networks have obtained impressive results on many tasks, they cannot perform non-local processing by explicitly relating potentially interconnected visual or textual entities. This relational aspect is fundamental for capturing high-level semantic interconnections in multimedia data or understanding the relationships between spatially distant objects in an image. This thesis tackles the relational understanding problem in Deep Neural Networks, considering three different yet related tasks: Relational Content-Based Image Retrieval (R-CBIR), Visual-Textual Retrieval, and the Same-Different tasks. We use state-of-the-art deep learning methods for relational learning, such as Relation Networks and Transformer networks, to relate the different entities in an image or in a text.

See at: etd.adm.unipi.it Open Access | ISTI Repository Open Access | CNR ExploRA


2022 Conference article Open Access
Combining EfficientNet and vision transformers for video deepfake detection
Coccomini D. A., Messina N., Gennaro C., Falchi F.
Deepfakes are the result of digital manipulation to forge realistic yet fake imagery. With the astonishing advances in deep generative models, fake images or videos are nowadays obtained using variational autoencoders (VAEs) or Generative Adversarial Networks (GANs). These technologies are becoming more accessible and accurate, resulting in fake videos that are very difficult to detect. Traditionally, Convolutional Neural Networks (CNNs) have been used to perform video deepfake detection, with the best results obtained using methods based on EfficientNet B7. In this study, we focus on video deepfake detection on faces, given that most methods are becoming extremely accurate in the generation of realistic human faces. Specifically, we combine various types of Vision Transformers with a convolutional EfficientNet B0 used as a feature extractor, obtaining results comparable with some very recent methods that use Vision Transformers. Unlike state-of-the-art approaches, we use neither distillation nor ensemble methods. Furthermore, we present a straightforward inference procedure based on a simple voting scheme for handling multiple faces in the same video shot. The best model achieved an AUC of 0.951 and an F1 score of 88.0%, very close to the state-of-the-art on the DeepFake Detection Challenge (DFDC). The code for reproducing our results is publicly available at https://tinyurl.com/cnn-vit-dfd.
Source: ICIAP 2022 - 21st International Conference on Image Analysis and Processing, pp. 219–229, Lecce, Italy, 23-27/05/2022
DOI: 10.1007/978-3-031-06433-3_19
See at: ISTI Repository Open Access | doi.org Restricted | link.springer.com Restricted | CNR ExploRA
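
The simple voting scheme for multiple faces in the same shot can be summarized in a few lines. A minimal sketch, assuming a hypothetical per-face classifier that outputs a fake probability for every detected face crop:

```python
# Majority vote over per-face fake probabilities within one video shot.
def shot_is_fake(face_scores, threshold=0.5):
    """face_scores: one fake probability per face detected in the shot."""
    votes = [score > threshold for score in face_scores]
    return sum(votes) > len(votes) / 2

print(shot_is_fake([0.9, 0.8, 0.2]))   # True: two of three faces look fake
```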